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ABSTRACT 

In this paper, we study how the D-iteration algorithm can be 
applied to numerically solve the differential equations such 
as heat equation in 2D or 3D. The method can be applied on 
the class of problems that can be addressed by the Gauss- 
Seidel iteration, based on the linear approximation of the 
differential equations. 

Categories and Subject Descriptors 

G.1.3 [Mathematics of Computing]: Numerical Analy- 
sis — Numerical Linear Algebra 

General Terms 

Algorithms, Performance 

Keywords 

Numerical computation; Iteration; Fixed point; Gauss-Seidel. 

1. INTRODUCTION 

The iterative methods to solve differential equations based 
on the linear approximation are very well studied approaches 
[PPj , [J, [12], 0, [14], [3], [13]. The approach we propose 
here (D-iteration) is a new approach initially applied to nu- 
merically solve the eigenvector of the Pagerank type equa- 
tion m, 0, 0, SI, 0, HQ], 0- 

The D-iteration, as diffusion based iteration, is an iter- 
ation method that can be understood as a column-vector 
based iteration as opposed to a row-vector based approach. 
Jacobi and Gauss-Seidel iterations are good examples of 
row-vector based iteration schemes. While our approach 
can be associated to the diffusion vision, the existing ones 
can be associated to the collection vision. 

In this paper, we are interested in the numerical solution 
for linear equation: 
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where A and B are the matrix and vector associated to 
the linear approximation of differential equations with initial 
conditions or boundary conditions. 

While it is quite clear why diffusion approach is interest- 
ing (0 HOI 0) when we consider a sparse matrix with a 
very variable structure of in-degree/out-degree links (non- 
zero entries of the matrix A), the problem statement in the 
context of linear system associated to differential equation 
is very different and we try to analyse/explain if there may 



be an interest to consider a solution such as the D-iteration 
for those very regular sparse matrix. 

2. EXAMPLE OF HEAT EQUATION 

A typical linearized equation of the stationary heat equa- 
tion in 2D is of the form: 



T(n, m) = - (T(n - 1, m) + T(n, m 



1) 



+T(n+l,m) + T(n,m + l)) 

which can be obtained by the discretization of the basic 
equation: 



d 2 T d 2 T 
A.T(x, y) = ^ + ^ = 0. 



(2) 



inside the surface Q. (for instance, SI = [0, L x ] x [0, L y \) Then 
additive terms appear for the initial or boundary conditions 
(Dirichlet) on the frontier dQ (for instance, for x = or 
y — etc). 

For this family of equations, the linear dependences re- 
main local (such as averaging of neighbour positions' val- 
ues). From the intuitive point of view, the application of 
iterative methods on such a system is convergent, because 
the system is diagonally dominant and strictly dominant for 
the boundary positions, so that the spectral radius of the 
global system is strictly less than 1. As a consequence, com- 
pared to PageRank type equation with a damping factor 
d for which the spectral radius of the system is explicitly 
equal to d < 1 (per row), the convergence of the iterative 
scheme should be slower for the same size of vector N. In- 
deed, for the linear system we consider here, from the fluid 
diffusion point of view, instead of having (PageRank equa- 
tion) a fluid decreasing factor 1 — d per entry level or vector 
level iteration, the fluid can only disappear when it reaches 
the boundary positions (for instance, the positions where 
the temperature is imposed from a heat source). 

Now, in such a context, may the diffusion based iteration 
scheme be useful, faster? We don't pretend to give the an- 
swer in this paper, we'll just illustrate the differences and 
possible advantages through very simple examples. 

The D-iteration requires updating two vectors: the fluid 
vector F and the history vector H instead of a single vec- 
tor for the Gauss-Seidel. It has been explained that the 
H vector is the exact connection to the Gauss-Seidel itera- 
tions (cf. [TO]) when starting from empty initial condition 
Ho = [0, ..,0]. As it was the case for the iteration on the 
web graph associated matrix, the utility of F is to give us 
the exact information on the quantity of fluid that is sent. 
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Collection = averaging Diffusion 
Figure 1: Intuition: collection vs diffusion. 

Whereas with the web graph, we may wish to apply the 
diffusion to the node having the maximum amount of fluid 
(because the fluid that disappears is directly proportional 
to this quantity), with the matrix we have here, we may 
wish to find a way to push fluid to the boundary positions 
as quickly as possible, because inside Q, the fluid does not 
disappear. 

3. FIRST TESTS AND ALGORITHM ADAP- 
TATION 

As a first very simple illustration, we consider the iteration 
scheme: 

T(n,m) = - (T(n- l,m) +T(n,m - 1) (3) 

+T(n + l,m) + T(n,m + l)) (4) 

for n = 1, .., L x and m = 1, .., L y with the boundary condi- 
tion: 

T(n, m) = 0, if n — or n — L x 
T(n, m) = 0, if m = L y 
T(n, m) = 100, if m = 0. 

We first implemented the Gauss-Seidel iteration using the 
collection operations defined by: 

Collection of (n,m) : (averaging) 
T[n][m] : = 

. 25* (T [n-1] [m] +T [n] [m-1] +T [n+1] [m] +T [n] [m+1] ) ; 

when (n, m) £ dS} and the stopping condition: 
max |T(n, m) k — T(n, m) k ~ 1 < e. 

Then, compared its computation cost to the D-iteration ap- 
proach with the diffusion condition: 

Diffusion of (n,m), if: 
F [n] [m] > error ; 

Remark that if there is no position satisfying _F[n][m] > 
error, the condition becomes equivalent to the above stop- 
ping condition for Gauss-Seidel iteration. 

For L x — L y = 1000 and an error of e = 0.1 (let's say, 
we want to know the temperature with a precision of 0.1 
degree; one should be careful that the stopping condition 
defined on the norm msx -/./max does not guarantee that the 
obtained result is effectively at distance e to the limit), we 
found the results: 

• Gauss-Seidel (GS): 2.35057e+08 operations of collec- 
tions (over 236 cycles) in 8.3 seconds; 



Table 1: Impact of S on the computation cost. 
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Table 2: Impact of L y on the computation cost. 



• D-iteration (DI): 2.61681e+06 operations of diffusions 
(over 629 cycles) in 5.4 seconds. 

Diffusion of (n,m) : 
received := F[n][m]/4; 
T[n-1] [m] += received; 
T[n] [m-1] += received; 
T[n+1] [m] += received; 
T[n] [m+1] += received; 

In this case, the boundary condition is such that the heat 
diffusion is progressive toward the position y = L y . Observe 
that with D-iteration, we did much less operations (by factor 
100!), but this implied the diffusion condition tests done on 
more cycles (629) than for Gauss-Seidel (236 cycles). 

To reduce the diffusion condition test cost, we modified 
the diffusion condition (DC) as: 

DC: 

Diffusion of (n,m) , if: 
F[n][m] > error/delta; 

The results when varying S is shown in Table [T] we see a 
clear deterministic effect of S when S is a multiple of 4 (due 
to the factor 1/4 of the matrix). 4 seems to be here the 
optimal value. 

If our understanding is correct, we should have more gain 
when L y is increased (more stable value): as expected, this 
is confirmed in Table [2] 

While the number of diffusions stays constant for DI, for 
GS the number of collection operations increases propor- 
tionally to L y . The increase of the run time with DI is only 
due to the diffusion condition to be tested on a larger set of 
points. 

With this DC modification, we select the fluid diffusion 
only at positions where the variation is worth to apply dif- 
fusion: when diffusion is applied at position (n,m), F[n][m] 
is exactly the value by which ff[n][m] is increased. With 
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Table 3: Speeding-up the DC test. GS: 236 cycles, 
DI: 234 cycles (except for L y = 25: 193, 193). no 
op: without compiler optimization (option — 02 with 

9 + +)- 



the averaging computation, we are likely to do a lot of use- 
less computations in case certain points of space have con- 
verged under the desired precision. And testing this condi- 
tion would imply the same cost than applying the collection 
operation (because this would require the knowledge of the 
variation of the values of neighbour positions): this condi- 
tion test is exactly what we may do efficiently with F. 

Here, we observe that the cost of testing diffusion condi- 
tion (DC) is very important compared to the diffusion cost 
(see the collections/diffusions ratio compared to the run- 
time ratio). This is the main difference with the web graph: 
with web graphs (iterators required for diffusions or collec- 
tions), the dominant computation cost was the access time 
to the entries of the matrix with the iterators. Here, this is 
no more the dominant cost. Therefore, we need a specific 
optimization to reduce the DC tests. 

In order to reduce the diffusion condition test cost, we 
introduced a new (bool) state variable open: 

bool open [n_x] [n_y] ; 

which is set to true initially. Then, this state variable is 
updated as follows: 

• when (DC) is not true, we set its open state to false; 

• when an open position sends fluids to its neighbour 
positions, these neighbour positions are set to true. 

Then, we test this state value before testing the diffusion 
condition DC. The results obtained by this modification is 
illustrate in Table 

We see that there is a significant gain on the runtime. 
Finally, with the adaptation of DC and the introduction 
of open state variables, the D-iteration may be an interest- 
ing candidate to efficiently solve numerically the differential 
equations. 

4. COST ANALYSIS 



4.1 Cost decomposition 

In order to further understand the performance improve- 
ment and the computation cost structure, we introduce the 
following assumption: we assume that the runtime RT of 
the solution computation is of the form: 

RT = (a + P) x (L x x L y ) x nbjter + c, 

where a is the unitary computation cost of one cycle iter- 
ation, /3 the average computation cost for one diffusion or 
one collection, c a constant cost and nbjter the number of 
iterations of the cycles. 

To estimate the a, /3, c values, we first iterated on x, y 
without the collection operations: 

sum = 0.0; 
counter = 0; 

while ( counter < nb_iter ){ 
count er++ ; 

for (int x=0; x < L_x; x++){ 
for (int y=0; y < L_y; y++H 
if ( ! is_boundary(x,y) ) 
sum += T [x] [y] ; 

} 

} 

} 

where is_boundary(x,y) is a variable (bool) that return 
true if we are on the position with boundary conditions (or 
outside H). Then a is estimated by dividing the runtime by 
nbjter and L x x L y . For /3, we introduce collection opera- 
tions and then a + f3 is estimated by dividing the runtime 
by nbjter and L x x L y . 

Finally, c is approximated by the initialization time. For 
DI, we do the same evaluation with: 

sum = 0.0; 
counter = 0; 

while ( counter < nb_iter ){ 
counter++ ; 

for (int x=0; x < L_x; x++){ 
for (int y=0; y < L_y; y++){ 

if ( ! is_boundary(x,y) and open[x][y] ){ 
transit = F [x] [y] ; 
if ( transit > error/scale ){_ 
sum += T [x] [y] ; 

> 

} 

} 

} 

} 

adding the DC test for a and then adding the diffusion op- 
erations to estimate a + ft. The results are shown on Table 
II 

We see that a and /3 are quite stable. We see that the 
collection cost is roughly 2-3 times the iteration cost for GS 
(we observed this factor is the one that varies the most with 
nbjter). For DI, its iteration cost is more than 1.5 times 
the iteration cost for GS and the diffusion cost is about 2 
times the iteration cost (very stable with nbjter). 

The runtime for GS in Table U for 1000 x 1000 (no op) 
can be decomposed as 5s (iterations) + 15s (collections). In 
this case, DI converged with about same number of cycles 
but with 30 times less diffusions (than collections) . So based 
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Table 4: Computation cost. Without compiler opti- 
mization. 
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Table 5: Speeding-up the DC test. 
DI: 234 cycles. 



GS: 236 cycles, 



on the above model, we would have a runtime of: 7.5s for 
iterations and 15/30 x 2 = Is. But we observed 2.3s instead 
of 8.5s because most of time we don't need to test twice 
the boundary and open condition. We observed that only 
testing open condition is close (in this case) to the cost to 
test the boundary condition, but also the main difference 
comes from the fact that when the open condition is not 
true, we don't have to do any computation and this had 
in this case an impact of cost reduction by up to 3-4 (the 
iteration runtime for all values open [x] [y] set to false runs 
3-4 times faster). Therefore, the computation time obtained 
for DI would be: 5/4 + 1. 

Now, we could optimize a bit more the DC condition and 
the boundary position tests including the boundary position 
information in the state variable open (for instance using, 3 
states variable: for boundary position, 1 for not open to 
DC test, 2 for open to DC test). Results are shown in Table 
® 

4.2 The number of diffusions per position 

Figure [2] shows the number of diffusions applied per posi- 
tion y. We see that above y — 41, the diffusion is no more 
applied, whereas with GS, the collection operations need to 
be applied on all y positions at each cycle. 

This example is of course for illustration of a situation 
when all positions does not converge to the limit at the same 
speed. If L y is closer to 41, there is no more gain using 
DC, because all positions are converging almost at the same 
speed. 

Figure [3] shows the number of diffusions per position y 
when in the case L y = 1000 we added 10 random positions 
where T is imposed (T set to a random value between and 
1000): in this case, we obtained: 
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Figure 2: Nb of diffusions applied: per y position. 



DI: 687 cycles with 3.97e+07 diffusions for runtime of 
4.0s. 
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Figure 3: Nb of diffusions applied: per y position: 
adding 10 random boundary conditions inside Q. 

Figure [4] is an illustration of the limit for the boundary 
conditions of: T = 100 on the frontier and T = imposed 
at 10 random positions. 

5. CONCLUSION 

In this paper we addressed a first analysis of the poten- 
tial of the D-iteration when applied in the context of the 
numerical solving of differential equations. We showed that 
this context requires an adaptation of the D-iteration's fluid 
diffusion condition. The results are quite promising and we 
hope to investigate further this application case in the fu- 
ture. 
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